Using Provenance to Streamline Data Exploration through Visualization

Authors

  • Steven P. Callahan
  • Juliana Freire
  • Emanuele Santos
  • Carlos E. Scheidegger
  • Cláudio T. Silva
  • Huy T. Vo
Abstract

Scientists face increasingly large volumes of data to analyze. To analyze and validate various hypotheses, they need to create insightful visual representations of both observed data and simulated processes. Often, insight comes from comparing multiple visualizations. But data exploration through visualization requires scientists to assemble complex workflows (pipelines consisting of sequences of operations that transform the data into appropriate visual representations), and today this process involves many error-prone and time-consuming tasks. We show how a new action-based model for capturing and maintaining detailed provenance of the visualization process can be used to streamline the data exploration process and reduce the time to insight. This model enables the flexible re-use of workflows, a scalable mechanism for creating a large number of visualizations, and collaboration in a distributed setting. A novel feature of this model is that it uniformly captures provenance information for both visualization data products and the workflows used to generate them. By also tracking the evolution of workflows, it not only ensures reproducibility but also allows scientists to easily navigate the space of workflows and parameter settings used in a given exploration task. We describe the implementation of this data exploration infrastructure in the VisTrails system and present two case studies that show how it greatly simplifies the scientific discovery process.
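The idea of an action-based provenance model can be illustrated with a minimal sketch. It assumes that a workflow is a mapping from module names to parameter settings, that every change is recorded as an action attached to a parent version, and that any version is rebuilt by replaying the actions on its path from the root. All names here (`Action`, `VersionTree`, `commit`, `materialize`, and the `reader` and `isosurface` modules) are hypothetical and chosen for illustration; this is not the VisTrails API or its actual data model.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# A pipeline is modeled (as an assumption) as: module name -> parameter dict.
Pipeline = Dict[str, Dict[str, object]]


@dataclass(frozen=True)
class Action:
    """One recorded change to a pipeline (hypothetical action vocabulary)."""
    kind: str                      # "add_module", "set_param", or "delete_module"
    module: str
    param: Optional[str] = None
    value: object = None


def apply_action(pipeline: Pipeline, action: Action) -> Pipeline:
    """Return a new pipeline with the action applied; the input is not mutated."""
    new = {name: dict(params) for name, params in pipeline.items()}
    if action.kind == "add_module":
        new[action.module] = {}
    elif action.kind == "set_param":
        new[action.module][action.param] = action.value
    elif action.kind == "delete_module":
        del new[action.module]
    else:
        raise ValueError(f"unknown action kind: {action.kind}")
    return new


class VersionTree:
    """Stores actions, not pipelines; version 0 is the empty root pipeline."""

    def __init__(self) -> None:
        self.parent: Dict[int, Optional[int]] = {0: None}
        self.action: Dict[int, Action] = {}
        self._next_id = 1

    def commit(self, parent: int, action: Action) -> int:
        """Record an action as a child of an existing version; return the new id."""
        if parent not in self.parent:
            raise KeyError(f"unknown parent version: {parent}")
        version = self._next_id
        self._next_id += 1
        self.parent[version] = parent
        self.action[version] = action
        return version

    def materialize(self, version: int) -> Pipeline:
        """Rebuild a pipeline by replaying the actions from the root to `version`."""
        chain = []
        current = version
        while current != 0:
            chain.append(self.action[current])
            current = self.parent[current]
        pipeline: Pipeline = {}
        for action in reversed(chain):
            pipeline = apply_action(pipeline, action)
        return pipeline


# Two explorations branch from the same base workflow (version v2).
tree = VersionTree()
v1 = tree.commit(0, Action("add_module", "reader"))
v2 = tree.commit(v1, Action("add_module", "isosurface"))
v3 = tree.commit(v2, Action("set_param", "isosurface", "isovalue", 57))
v4 = tree.commit(v2, Action("set_param", "isosurface", "isovalue", 90))
```

Because only actions are stored, earlier versions are never overwritten: `tree.materialize(v3)` and `tree.materialize(v4)` yield sibling workflows that differ in a single parameter, which is the kind of side-by-side comparison the abstract describes.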

Related articles

Visualization in Radiation Oncology: Towards Replacing the Laboratory Notebook

Data exploration in radiation oncology requires the creation of a large number of visualizations. For treatment planning, detailed information about the processes used to manipulate data collected and to create visualizations is needed for assessing the quality of the results. Current visualization systems allow the interactive creation and manipulation of complex visualizations. However, they ...

A Fully Integrated Approach for Better Determination of Fracture Parameters Using Streamline Simulation; A gas condensate reservoir case study in Iran

Many large oil and gas fields in the most productive world regions happen to be fractured. The exploration and development of these reservoirs is a true challenge for many operators. These difficulties are due to uncertainties in geological fracture properties such as aperture, length, connectivity and intensity distribution. To successfully address these challenges, it is paramount to im...

Provenance-Based Visual Data Exploration with EVLIN

Tools for visual data exploration allow users to visually browse through and analyze datasets to possibly reveal interesting information hidden in the data that users are a priori unaware of. Such tools rely on both query recommendations to select data to be visualized and visualization recommendations for these data to best support users in their visual data exploration process. EVLIN (explori...

Interactive view-driven evenly spaced streamline placement

This paper presents an Interactive View-Driven Evenly Spaced Streamline placement algorithm (IVDESS) for 3D explorative visualization of large complex planar or curved surface flows. IVDESS rapidly performs accurate streamline integration in 3D physical space, i.e., the flow field, while achieving high quality streamline density control in 2D view space, i.e., the output image. The corresponden...

Provenance for Visualizations Reproducibility and beyond Visualization Systems

Computing has been an enormous accelerator for science, leading to an information explosion in many different fields. Future scientific advances depend on our ability to comprehend the vast amounts of data currently being produced and acquired. To analyze and understand this data, though, we must assemble complex computational processes and generate insightful visualizations, which often requi...

Publication date: 2006